Denotational semantics

In computer science, denotational semantics (initially known as mathematical semantics or Scott–Strachey semantics) is an approach to formalizing the meanings of programming languages by constructing mathematical objects (called denotations) which describe the meanings of expressions from the languages. Other approaches to providing a formal semantics of programming languages include axiomatic semantics and operational semantics.

Broadly speaking, denotational semantics is concerned with finding mathematical objects called domains that represent what programs do. For example, programs (or program phrases) might be represented by partial functions, or by Actor event diagram scenarios, or by games between the environment and the system: these are all general examples of domains.

An important tenet of denotational semantics is that semantics should be compositional: the denotation of a program phrase should be built out of the denotations of its subphrases.

Contents

Historical development

Denotational semantics originated in the work of Christopher Strachey and Dana Scott in the late 1960s.[1] As originally developed by Strachey and Scott, denotational semantics provided the denotation (meaning) of a computer program as a function that mapped input into output.[2] To give denotations to recursively defined programs, Scott proposed working with continuous functions between domains, specifically complete partial orders. As described below, work has continued in investigating appropriate denotational semantics for aspects of programming languages such as sequentiality, concurrency, non-determinism and local state.

Denotational semantics have been developed for modern programming languages that use capabilities like concurrency and exceptions, e.g., Concurrent ML,[3] CSP,[4] and Haskell.[5] The semantics of these languages is compositional in that the denotation of a phrase depends on the denotations of its subphrases. For example, the meaning of the applicative expression f(E1,E2) is defined in terms of semantics of its subphrases f, E1 and E2. In a modern programming language, E1 and E2 can be evaluated concurrently and the execution of one them might affect the other by interacting through shared objects causing their denotations to be defined in terms of each other. Also, E1 or E2 might throw an exception which could terminate the execution of the other one. The sections below describe special cases of the semantics of these modern programming languages.

Denotations of recursive programs

A denotational semantics is given to a program phrase as a function from an environment (that has the values of its free variables) to its denotation. For example, the phrase n*m produces a denotation when provided with an environment that has binding for its two free variables: n and m. If in the environment n has the value 3 and m has the value 5, then the denotation is 15.

A function can be modeled as denoting a set of ordered pairs where each ordered pair in the set consists of two parts (1) an argument for the function and (2) the value of the function for that argument. For example the set of order pairs {[0 1] [4 3]} is the denotation of a function with value 1 for argument 0, value 3 for the argument 4, and is otherwise undefined.

The problem to be solved is to provide denotations for recursive programs that are defined in terms of themselves such as the definition of the factorial function as

factorial ≡ λ(n) if (n==0) then 1 else n*factorial(n-1).

A solution is to build up the denotation by approximation starting with the empty set of order pairs (which in set theory would be written as {}). If {} is plugged into the above definition of factorial then the denotation is {[0 1]}, which is a better approximation of factorial. Iterating: If {[0 1]} is plugged into the definition then the denotation is {[0 1] [1 1]}. So it is convenient to think of an approximation to factorial as an input F in the following way:

λ(F) λ(n) if (n==0) then 1 else n*F(n-1).

It is instructive to think of a chain of "iterates" where Fi indicates i-many applications of F.

The least upper bound of this chain is the full factorial function which can be expressed as follows where the symbol "⊔" means "least upper bound":

\bigsqcup_{i \in \mathbb N} F^i(\{\}).

Denotational semantics of non-deterministic programs

The concept of power domains has been developed to give a denotational semantics to non-deterministic sequential programs. Writing P for a power domain constructor, the domain P(D) is the domain of non-deterministic computations of type denoted by D.

There are difficulties with fairness and unboundedness in domain-theoretic models of non-determinism.[6] See Power domains for nondeterminism.

Denotational semantics of concurrency

Many researchers have argued that the domain theoretic models given above do not suffice for the more general case of concurrent computation. For this reason various new models have been introduced. In the early 1980s, people began using the style of denotational semantics to give semantics for concurrent languages. Examples include Will Clinger's work with the actor model; Glynn Winskel's work with event structures and petri nets;[7] and the work by Francez, Hoare, Lehmann, and de Roever (1979) on trace semantics for CSP.[8] All these lines of inquiry remain under investigation (see e.g. the various denotational models for CSP[4]).

Recently, Winskel and others have proposed the category of profunctors as a domain theory for concurrency.[9][10]

Denotational semantics of state

State (such as a heap) and simple imperative features can be straightforwardly modeled in the denotational semantics described above. All the textbooks below have the details. The key idea is to consider a command as a partial function on some domain of states. The denotation of "x:=3" is then the function that takes a state to the state with 3 assigned to x. The sequencing operator ";" is denoted by composition of functions. Fixed-point constructions are then used to give a semantics to looping constructs, such as "while".

Things become more difficult in modelling programs with local variables. One approach is to no longer work with domains, but instead to interpret types as functors from some category of worlds to a category of domains. Programs are then denoted by natural continuous functions between these functors.[11][12]

Denotations of data types

Many programming languages allow users to define recursive data types. For example, the type of lists of numbers can be specified by

datatype list = Cons of (Nat, list) | Empty.

This section deals only with functional data structures that cannot change. Conventional imperative programming languages would typically allow the elements of such a recursive list to be changed.

For another example: the type of denotations of the untyped lambda calculus is

datatype D = (D → D)

The problem of solving domain equations is concerned with finding domains that model these kinds of datatypes. One approach, roughly speaking, is to consider the collection of all domains as a domain itself, and then solve the recursive definition there. The textbooks below give more details.

Polymorphic data types are data types that are defined with a parameter. For example, the type of α lists is defined by

datatype α list = Cons of (α, α list) | Empty.

Lists of numbers, then, are of type Nat list, while lists of strings are of String list.

Some researchers have developed domain theoretic models of polymorphism. Other researchers have also modeled parametric polymorphism within constructive set theories. Details are found in the textbooks listed below.

A recent research area has involved denotational semantics for object and class based programming languages.[13]

Denotational semantics for programs of restricted complexity

Following the development of programming languages based on linear logic, denotational semantics have been given to languages for linear usage (see e.g. proof nets, coherence spaces) and also polynomial time complexity.[14]

Denotational semantics of sequentiality

The problem of full abstraction for the sequential programming language PCF was, for a long time, a big open question in denotational semantics. The difficulty with PCF is that it is a very sequential language. For example, there is no way to define the parallel-or function in PCF. It is for this reason that the approach using domains, as introduced above, yields a denotational semantics that is not fully abstract.

This open question was mostly resolved in the 1990s with the development of game semantics and also with techniques involving logical relations.[15] For more details, see the page on PCF.

Denotational semantics as source-to-source translation

It is often useful to translate one programming language into another. For example, a concurrent programming language might be translated into a process calculus; a high-level programming language might be translated into byte-code. (Indeed, conventional denotational semantics can be seen as the interpretation of programming languages into the internal language of the category of domains.)

In this context, notions from denotational semantics, such as full abstraction, help to satisfy security concerns.[16][17]

Abstraction

It is often considered important to connect denotational semantics with operational semantics. This is especially important when the denotational semantics is rather mathematical and abstract, and the operational semantics is more concrete or closer to the computational intuitions. The following properties of a denotational semantics are often of interest.

  1. Syntax independence: The denotations of programs should not involve the syntax of the source language.
  2. Soundness: All observably distinct programs have distinct denotations;
  3. Full abstraction: Two programs have the same denotations precisely when they are observationally equivalent. For semantics in the traditional style, full abstraction may be understood roughly as the requirement that "operational equivalence coincides with denotational equality". For denotational semantics in more intensional models, such as the Actor model and process calculi, there are different notions of equivalence within each model, and so the concept of full abstraction is a matter of debate, and harder to pin down. Also the mathematical structure of operational semantics and denotational semantics can become very close.

Additional desirable properties we may wish to hold between operational and denotational semantics are:

  1. Constructivism: Constructivism is concerned with whether domain elements can be shown to exist by constructive methods.
  2. Independence of denotational and operational semantics: The denotational semantics should be formalized using mathematical structures that are independent of the operational semantics of a programming language; However, the underlying concepts can be closely related. See the section on Compositionality below.
  3. Full completeness or definability: Every morphism of the semantic model should be the denotation of a program.[18]

Compositionality

An important aspect of denotational semantics of programming languages is compositionality, by which the denotation of a program is constructed from denotations of its parts. For example consider the expression "7 + 4". Compositionality in this case is to provide a meaning for "7 + 4" in terms of the meanings of "7", "4" and "+".

A basic denotational semantics in domain theory is compositional because it is given as follows. We start by considering program fragments, i.e. programs with free variables. A typing context assigns a type to each free variable. For instance, in the expression (x + y) might be considered in a typing context (x:nat,y:nat). We now give a denotational semantics to program fragments, using the following scheme.

  1. We begin by describing the meaning of the types of our language: the meaning of each type must be a domain. We write 〚τ〛 for the domain denoting the type τ. For instance, the meaning of type nat should be the domain of natural numbers: 〚nat〛=ℕ.
  2. From the meaning of types we derive a meaning for typing contexts. We set 〚 x11,..., xnn〛 = 〚 τ1〛× ... ×〚τn〛. For instance, 〚x:nat,y:nat〛= ℕ×ℕ. As a special case, the meaning of the empty typing context, with no variables, is the domain with one element, denoted 1.
  3. Finally, we must give a meaning to each program-fragment-in-typing-context. Suppose that P is a program fragment of type σ, in typing context Γ, often written Γ⊢P:σ. Then the meaning of this program-in-typing-context must be a continuous function 〚Γ⊢P:σ〛:〚Γ〛→〚σ〛. For instance, 〚⊢7:nat〛:1→ℕ is the constantly "7" function, while 〚x:nat,y:natx+y:nat〛:ℕ×ℕ→ℕ is the function that adds two numbers.

Now, the meaning of the compound expression (7+4) is determined by composing the three functions 〚⊢7:nat〛:1→ℕ, 〚⊢4:nat〛:1→ℕ, and 〚x:nat,y:natx+y:nat〛:ℕ×ℕ→ℕ.

In fact, this is a general scheme for compositional denotational semantics. There is nothing specific about domains and continuous functions here. One can work with a different category instead. For example, in game semantics, the category of games has games as objects and strategies as morphisms: we can interpret types as games, and programs as strategies. For a simple language without general recursion, we can make do with the category of sets and functions. For a language with side-effects, we can work in the Kleisli category for a monad. For a language with state, we can work in a functor category. Milner has advocated modelling location and interaction by working in a category with interfaces as objects and bigraphs as morphisms.[19]

Semantics versus implementation

According to Dana Scott [1980]:

It is not necessary for the semantics to determine an implementation, but it should provide criteria for showing that an implementation is correct.

According to Clinger (1981):

Usually, however, the formal semantics of a conventional sequential programming language may itself be interpreted to provide an (inefficient) implementation of the language. A formal semantics need not always provide such an implementation, though, and to believe that semantics must provide an implementation leads to confusion about the formal semantics of concurrent languages. Such confusion is painfully evident when the presence of unbounded nondeterminism in a programming language's semantics is said to imply that the programming language cannot be implemented.

Connections to other areas of computer science

Some work in denotational semantics has interpreted types as domains in the sense of domain theory which can be seen as a branch of model theory, leading to connections with type theory and category theory. Within computer science, there are connections with abstract interpretation, program verification, and model checking.

Monads were introduced to denotational semantics as a way of organising semantics, and these ideas have had a big impact in functional programming (see monads in functional programming).

References

  1. ^ Dana S. Scott. Outline of a mathematical theory of computation. Technical Monograph PRG-2, Oxford University Computing Laboratory, Oxford, England, November 1970.
  2. ^ Dana Scott and Christopher Strachey. Toward a mathematical semantics for computer languages Oxford Programming Research Group Technical Monograph. PRG-6. 1971.
  3. ^ John Reppy "Concurrent ML: Design, Application and Semantics" in Springer-Verlag, Lecture Notes in Computer Science, Vol. 693. 1993
  4. ^ a b A. W. Roscoe. "The Theory and Practice of Concurrency" Prentice-Hall. Revised 2005.
  5. ^ Simon Peyton Jones, Alastair Reid, Fergus Henderson, Tony Hoare, and Simon Marlow. "A semantics for imprecise exceptions" Conference on Programming Language Design and Implementation. 1999.
  6. ^ Paul Blain Levy: Amb Breaks Well-Pointedness, Ground Amb Doesn't. Electr. Notes Theor. Comput. Sci. 173: 221-239 (2007)
  7. ^ Event Structure Semantics for CCS and Related Languages. DAIMI Research Report, University of Aarhus, 67 pp., April 1983.
  8. ^ Nissim Francez, C.A.R. Hoare, Daniel Lehmann, and Willem-Paul de Roever. Semantics of nondeterminism, concurrency, and communication Journal of Computer and System Sciences. December 1979.
  9. ^ Gian Luca Cattani, Glynn Winskel. Profunctors, open maps and bisimulation. Mathematical Structures in Computer Science, 15(3):553–614 (2005).
  10. ^ Mikkel Nygaard, Glynn Winskel: Domain theory for concurrency. Theoretical Computer Science, 316(1):153–190 (2004).
  11. ^ Peter W. O'Hearn, John Power, Robert D. Tennent, Makoto Takeyama. Syntactic control of interference revisited. Electr. Notes Theor. Comput. Sci. 1. 1995.
  12. ^ Frank J. Oles. A Category-Theoretic Approach to the Semantics of Programming. PhD thesis, Syracuse University, New York, USA. 1982.
  13. ^ Bernhard Reus, Thomas Streicher. Semantics and logic of object calculi. Theor. Comput. Sci., 316(1):191–213 (2004).
  14. ^ P. Baillot. Stratified coherence spaces: a denotational semantics for Light Linear Logic (ps.gz) Theoretical Computer Science , 318 (1-2), pp. 29-55, 2004.
  15. ^ P. W. O'Hearn and J. G. Riecke. Kripke Logical Relations and PCF, Information and Computation, 120(1):107–116 (July 1995).
  16. ^ Martin Abadi. Protection in programming-language translations. Proc. of ICALP'98. LNCS 1443. 1998.
  17. ^ Andrew Kennedy. Securing the .NET programming model. Theoretical Computer Science, 364(3). 2006
  18. ^ Curien, Pierre-Louis (2007). "Definability and Full Abstraction". Electronic Notes in Theoretical Computer Science (Papers in honour of Gordon Plotkin: Elsevier) 172: 301–310. doi:10.1016/j.entcs.2007.02.011. 
  19. ^ The Space and Motion of Communicating Agents. Robin Milner. Cambridge University Press, 2009, ISBN 9780521738330, 2009 draft.

Further reading

Textbooks
Lecture notes
Other references

External links